Regional Variation of Domain-Specific Lexical Items: Toward a Pan-Chinese Lexical Resource
Abstract
This paper reports on an initial and necessary step toward the construction of a Pan-Chinese lexical resource. We investigated the regional variation of lexical items in two specific domains, finance and sports, and explored how much of this variation is covered by existing Chinese synonym dictionaries, in particular the Tongyici Cilin. The domain-specific lexical items were obtained from subsections of a synchronous Chinese corpus, LIVAC. Results showed that 20-40% of the words from the various subcorpora are unique to individual communities, and as many as 70% of these unique items are not yet covered in the Tongyici Cilin. The results suggest great potential for building a Pan-Chinese lexical resource for Chinese language processing. Our next step is to explore automatic means of extracting related lexical items from the corpus and to incorporate them into existing semantic classifications.
Similar resources
Toward a Pan-Chinese Thesaurus
In this paper, we propose a corpus-based approach to the construction of a Pan-Chinese lexical resource, starting out with the aim to enrich existing Chinese thesauri in the Pan-Chinese context. The resulting thesaurus is thus expected to contain not only the core senses and usages of Chinese lexical items but also usages specific to individual Chinese speech communities. We introduce the ratio...
Extending a Thesaurus with Words from Pan-Chinese Sources
In this paper, we work on extending a Chinese thesaurus with words distinctly used in various Chinese communities. The acquisition and classification of such region-specific lexical items is an important step toward the larger goal of constructing a Pan-Chinese lexical resource. In particular, we extend a previous study in three respects: (1) to improve automatic classification by removing dupl...
Extending a Thesaurus in the Pan-Chinese Context
In this paper, we address a unique problem in Chinese language processing and report on our study on extending a Chinese thesaurus with region-specific words, mostly from the financial domain, from various Chinese speech communities. With the larger goal of automatically constructing a Pan-Chinese lexical resource, this work aims at taking an existing semantic classificatory structure as levera...
Impact of Density and Distribution of Unfamiliar Lexical Items on Iranian EFL Learners’ Successful Reading Comprehension Achievement
Density and distribution of Unfamiliar Lexical Items (ULIs) appear to influence learners’ Reading Comprehension Achievement (RCA). This study concerns the impact of these two variables on Iranian EFL learners’ RCA. To this end, two groups of students were timetabled for experiments designed to assess learners’ RCA. To determine the participants’ levels of proficiency a Quick Proficiency Test was fi...